Capacity Bound-free Web Warehouse

Authors

  • Yahiko Kambayashi
  • Kai Cheng
Abstract

Web cache technologies have been developed as an extension of CPU caches by modifying LRU (Least Recently Used) algorithms. In web cache systems, however, we can use disks and tertiary storage, since the access time of disks (or even online tapes) is still shorter than the time required to retrieve web pages from origin servers. Thus, we can remove the restriction on cache size, which has been the most severe constraint in designing cache algorithms. We still need to determine the priority of data for efficient processing. In this paper, the concept of the Capacity Bound-free Web Warehouse (CBFWW) is introduced. Conventional web cache systems make the following assumptions: (1) the priority of each object corresponding to web contents is determined simply by using a queue; (2) each object is independent; (3) the cache is transparent, so a user cannot know its contents. Because communication over the web is very slow, we can afford complicated algorithms to determine priority. The priority of newly retrieved documents is determined by computing their similarities to documents with known priorities. Although objects with high priority are usually stored in fast-access storage, we will discuss how to handle large objects. In conventional database systems, usage information such as priority is hidden from users. Because in web-based applications only a small fraction of the data becomes a hot spot, and hot spots change very rapidly, a self-organizing property based on dynamically changing priority is important. Because web data are highly complex, we have to handle mutually related data objects. Links and structures of documents are also factors in determining priority. Because a large amount of data is stored, we should use the contents as in database systems, not as in conventional cache systems. Therefore, transparency is not assumed. We need mining functions to analyze usage patterns. We can retrieve objects together with usage data, which was not possible with cache or database systems. The whole system should have the functions of caches, databases and data warehouses. We can further add useful functions.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment. Proceedings of the 2003 CIDR Conference
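
The abstract states that the priority of a newly retrieved document is determined from its similarity to documents with known priorities, but it does not give the similarity measure or the combination rule. The following is a minimal sketch under assumptions that are not from the paper: bag-of-words term counts, cosine similarity, and a similarity-weighted average of the known priorities. The function names (`term_vector`, `cosine_similarity`, `estimate_priority`) are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (assumed representation, not from the paper)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_priority(new_text, known_docs, default=0.0):
    """Estimate a priority for new_text as the similarity-weighted average
    of the priorities of known_docs, a list of (text, priority) pairs.
    Falls back to `default` when nothing similar is known.
    """
    new_vec = term_vector(new_text)
    weighted_sum = 0.0
    total_similarity = 0.0
    for text, priority in known_docs:
        sim = cosine_similarity(new_vec, term_vector(text))
        weighted_sum += sim * priority
        total_similarity += sim
    if total_similarity == 0:
        return default
    return weighted_sum / total_similarity
```

Because web retrieval is slow relative to local computation, as the abstract argues, a per-document pass like this over the known set could plausibly be affordable; a real system would likely also weigh links and document structure, which this sketch ignores.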

Similar resources

The optimal warehouse capacity: A queuing-based fuzzy programming approach

Among the various existing models for warehousing management, the simultaneous use of private and public warehouses is one of the best known. The purpose of this article is to develop a queuing theory-based model for determining the optimal capacity of a private warehouse in order to minimize the total corresponding costs. In the proposed model, the available space and budget to create a...

An Improvement in Cation Exchange Capacity Estimation and Water Saturation Calculation in Shaly Layers for One of Iranian Oil Fields

Water saturation and cation exchange capacity are the most significant parameters used to calculate a hydrocarbon zone potential. In clean formations, by applying the famous Archie model, which assumes that in the formation the only electric conductor is the formation water, the water saturation can be calculated. Additionally, in shaly sand formations this assumption may not be true as the ion...

The Vehicle-Routing Problem with Delivery and Back-Haul Options

In this article we consider a version of the vehicle-routing problem (VRP): A fleet of identical capacitated vehicles serves a system of one warehouse and N customers of two types dispersed in the plane. Customers may require deliveries from the warehouse, back hauls to the warehouse, or both. The objective is to design a set of routes of minimum total length to serve all customers, without vio...

Change Detection and Maintenance of an XML Web Warehouse

The World Wide Web contains a huge and increasing volume of information. The web warehouse is an efficient and effective means to facilitate utilization of information on the Web, not only to individual users but also to business organizations, especially for decision-making purposes. On the other hand, XML has recently become the new standard for representation and exchange of data on the Web....

Enhanced Architecture of a Web Warehouse based on Quality Evaluation Framework to Incorporate Quality Aspects in Web Warehouse Creation

In recent years, it has been observed that the World Wide Web (WWW) has become a vast source of information across all areas of interest. Retrieving relevant information from the web is difficult because there is no universal configuration or organization of web data. Taking advantage of data warehouse functionality and integrating it with the web to retrieve relevant data is the ...

Publication date: 2003